Bootstrap Inference for Multiple Imputation under Uncongeniality and Misspecification

Author: Bartlett Jonathan W.
Hughes Rachael A.
Publication venue: 'SAGE Publications'
Publication date: 27/11/2019

Multiple imputation has become one of the most popular approaches for handling missing data in statistical analyses. Part of this success is due to Rubin's simple combination rules. These give frequentist valid inferences when the imputation and analysis procedures are so-called congenial and the complete data analysis is valid, but otherwise may not. Roughly speaking, congeniality corresponds to whether the imputation and analysis models make different assumptions about the data. In practice imputation and analysis procedures are often not congenial, such that tests may not have the correct size and confidence interval coverage deviates from the advertised level. We examine a number of recent proposals which combine bootstrapping with multiple imputation, and determine which are valid under uncongeniality and model misspecification. Imputation followed by bootstrapping generally does not result in valid variance estimates under uncongeniality or misspecification, whereas bootstrapping followed by imputation does. We recommend a particular computationally efficient variant of bootstrapping followed by imputation.
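
As a rough illustration of the "bootstrapping followed by imputation" ordering described in this abstract, the sketch below resamples the incomplete data first and only then imputes within each bootstrap sample. Everything in it is an assumption made for illustration: the function names `impute_once` and `boot_then_impute`, the toy data, the simple stochastic regression imputation, and the percentile interval. It is not necessarily the computationally efficient variant the authors recommend, which the abstract does not specify.

```python
import numpy as np

def impute_once(x, y, rng):
    """Fill missing y by stochastic regression on x (hypothetical helper).

    Simplification: regression parameters are estimated, not drawn from a
    posterior, so this is not a fully 'proper' imputation procedure.
    """
    obs = ~np.isnan(y)
    X = np.column_stack([np.ones(obs.sum()), x[obs]])
    beta, *_ = np.linalg.lstsq(X, y[obs], rcond=None)
    resid = y[obs] - X @ beta
    sigma = np.sqrt(resid @ resid / (obs.sum() - 2))
    y_imp = y.copy()
    n_mis = int((~obs).sum())
    # Predicted value plus random noise for each missing outcome
    y_imp[~obs] = beta[0] + beta[1] * x[~obs] + rng.normal(0.0, sigma, n_mis)
    return y_imp

def boot_then_impute(x, y, n_boot=200, n_imp=2, seed=0):
    """Bootstrap the incomplete data, then impute inside each resample."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows, missing values included
        xb, yb = x[idx], y[idx]
        # Average the estimate of the mean of y over n_imp imputations
        estimates[b] = np.mean(
            [impute_once(xb, yb, rng).mean() for _ in range(n_imp)]
        )
    lo, hi = np.percentile(estimates, [2.5, 97.5])  # percentile interval
    return estimates.mean(), (lo, hi)

# Toy data (assumed): y depends on x, about 30% of y missing completely at random
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = 1.0 + 2.0 * x + rng.normal(size=300)
y[rng.random(300) < 0.3] = np.nan

point, (lo, hi) = boot_then_impute(x, y)
print(f"estimated mean of y: {point:.2f}, 95% interval: ({lo:.2f}, {hi:.2f})")
```

Because the resampling happens before imputation, the spread of the bootstrap estimates reflects both sampling variability and imputation variability without relying on Rubin's rules.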

arXiv.org e-Print Archive

LSHTM Research Online

Explore Bristol Research

Estimation of the linear mixed integrated Ornstein-Uhlenbeck model.

Author: Hughes Rachael A
Kenward Michael G
Sterne Jonathan AC
Tilling Kate
Publication venue: 'Informa UK Limited'
Publication date: 12/01/2017

The linear mixed model with an added integrated Ornstein-Uhlenbeck (IOU) process (linear mixed IOU model) allows for serial correlation and estimation of the degree of derivative tracking. It is rarely used, partly due to the lack of available software. We implemented the linear mixed IOU model in Stata and using simulations we assessed the feasibility of fitting the model by restricted maximum likelihood when applied to balanced and unbalanced data. We compared different (1) optimization algorithms, (2) parameterizations of the IOU process, (3) data structures and (4) random-effects structures. Fitting the model was practical and feasible when applied to large and moderately sized balanced datasets (20,000 and 500 observations), and large unbalanced datasets with (non-informative) dropout and intermittent missingness. Analysis of a real dataset showed that the linear mixed IOU model was a better fit to the data than the standard linear mixed model (i.e. independent within-subject errors with constant variance)
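
The abstract does not give the covariance structure of the IOU process. As a minimal sketch, the function below uses the form commonly stated in the literature for an IOU process that starts at zero and is driven by a stationary Ornstein-Uhlenbeck process, where `alpha` governs the degree of derivative tracking and `sigma` scales the process. The function name `iou_covariance` and this parameterization are assumptions for illustration, and need not match the parameterizations compared in the paper or its Stata implementation.

```python
import numpy as np

def iou_covariance(times, alpha, sigma):
    """Covariance matrix of an IOU process at the given measurement times.

    Assumed form (process starts at zero, stationary underlying OU process):
    Cov(W(s), W(t)) = sigma^2 / (2 alpha^3)
        * (2 alpha min(s, t) + exp(-alpha s) + exp(-alpha t)
           - 1 - exp(-alpha |t - s|))
    """
    t = np.asarray(times, dtype=float)
    s, u = np.meshgrid(t, t, indexing="ij")
    return (sigma**2 / (2.0 * alpha**3)) * (
        2.0 * alpha * np.minimum(s, u)
        + np.exp(-alpha * s)
        + np.exp(-alpha * u)
        - 1.0
        - np.exp(-alpha * np.abs(s - u))
    )

# Example: within-subject covariance at three measurement occasions
cov = iou_covariance([1.0, 2.0, 3.0], alpha=1.0, sigma=1.0)
print(cov)
```

In a linear mixed IOU model, a matrix like this would be added to the random-effects and measurement-error covariance for each subject, evaluated at that subject's own measurement times.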

LSHTM Research Online

Explore Bristol Research

Relative efficiency of joint-model and full-conditional-specification multiple imputation when conditional models are compatible: The general location model.

Author: Hughes Rachael A
Seaman Shaun R
Publication venue: Stat Methods Med Res
Publication date: 01/06/2018

Estimating the parameters of a regression model of interest is complicated by missing data on the variables in that model. Multiple imputation is commonly used to handle these missing data. Joint model multiple imputation and full-conditional specification multiple imputation are known to yield imputed data with the same asymptotic distribution when the conditional models of full-conditional specification are compatible with that joint model. We show that this asymptotic equivalence of imputation distributions does not imply that joint model multiple imputation and full-conditional specification multiple imputation will also yield asymptotically equally efficient inference about the parameters of the model of interest, nor that they will be equally robust to misspecification of the joint model. When the conditional models used by full-conditional specification multiple imputation are linear, logistic and multinomial regressions, these are compatible with a restricted general location joint model. We show that multiple imputation using the restricted general location joint model can be substantially more asymptotically efficient than full-conditional specification multiple imputation, but this typically requires very strong associations between variables. When associations are weaker, the efficiency gain is small. Moreover, full-conditional specification multiple imputation is shown to be potentially much more robust than joint model multiple imputation using the restricted general location model to misspecification of that model when there is substantial missingness in the outcome variable.

Apollo (Cambridge)

Explore Bristol Research

Joint modelling rationale for chained equations.

Author: Carpenter James R
Hughes Rachael A
Seaman Shaun R
Sterne Jonathan AC
Tilling Kate
White Ian R
Publication venue: BMC Med Res Methodol
Publication date: 21/02/2014

BACKGROUND: Chained equations imputation is widely used in medical research. It uses a set of conditional models, so is more flexible than joint modelling imputation for the imputation of different types of variables (e.g. binary, ordinal or unordered categorical). However, chained equations imputation does not correspond to drawing from a joint distribution when the conditional models are incompatible. Concurrently with our work, other authors have shown the equivalence of the two imputation methods in finite samples. METHODS: Taking a different approach, we prove, in finite samples, sufficient conditions for chained equations and joint modelling to yield imputations from the same predictive distribution. Further, we apply this proof in four specific cases and conduct a simulation study which explores the consequences when the conditional models are compatible but the conditions otherwise are not satisfied. RESULTS: We provide an additional "non-informative margins" condition which, together with compatibility, is sufficient. We show that the non-informative margins condition is not satisfied, despite compatible conditional models, in a situation as simple as two continuous variables and one binary variable. Our simulation study demonstrates that as a consequence of this violation order effects can occur; that is, systematic differences depending upon the ordering of the variables in the chained equations algorithm. However, the order effects appear to be small, especially when associations between variables are weak. CONCLUSIONS: Since chained equations is typically used in medical research for datasets with different types of variables, researchers must be aware that order effects are likely to be ubiquitous, but our results suggest they may be small enough to be negligible

LSHTM Research Online

Springer - Publisher Connector

PubMed Central

Apollo (Cambridge)

Explore Bristol Research

Accounting for missing data in statistical analyses: multiple imputation is not always the answer

Author: Heron Jon
Hughes Rachael A.
Sterne Jonathan A.C.
Tilling Kate
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/08/2019

Explore Bristol Research

Selection bias when estimating average treatment effects using one-sample instrumental variable analysis

Author: Davey Smith George
Davies Neil M.
Hughes Rachael A.
Tilling Kate
Publication venue: 'Ovid Technologies (Wolters Kluwer Health)'
Publication date: 01/05/2019

Crossref

Explore Bristol Research

Association of invasion-promoting tenascin-C additional domains with breast cancers in young women

Author: Guttery David S
Hancox Rachael A
Hughes Simon
Jones J Louise
Lambe Sinead M
Mulligan Kellie T
Pringle J Howard
Shaw Jacqueline A
Walker Rosemary A
Publication venue: BioMed Central
Publication date: 01/01/2010

Introduction: Tenascin-C (TNC) is a large extracellular matrix glycoprotein that shows prominent stromal expression in many solid tumours. The profile of isoforms expressed differs between cancers and normal breast, with the two additional domains AD1 and AD2 considered to be tumour associated. The aim of the present study was to investigate expression of AD1 and AD2 in normal, benign and malignant breast tissue to determine their relationship with tumour characteristics and to perform in vitro functional assays to investigate the role of AD1 in tumour cell invasion and growth. Methods: Expression of AD1 and AD2 was related to hypoxanthine phosphoribosyltransferase 1 as a housekeeping gene in breast tissue using quantitative RT-PCR, and the results were related to clinicopathological features of the tumours. Constructs overexpressing an AD1-containing isoform (TNC-14/AD1/16) were transiently transfected into breast carcinoma cell lines (MCF-7, T-47 D, ZR-75-1, MDA-MB-231 and GI-101) to assess the effect in vitro on invasion and growth. Statistical analysis was performed using a nonparametric Mann-Whitney test for comparison of clinicopathological features with levels of TNC expression and using Jonckheere-Terpstra trend analysis for association of expression with tumour grade. Results: Quantitative RT-PCR detected AD1 and AD2 mRNA expression in 34.9% and 23.1% of 134 invasive breast carcinomas, respectively. AD1 mRNA was localised by in situ hybridisation to tumour epithelial cells, and more predominantly to myoepithelium around associated normal breast ducts. Although not tumour specific, AD1 and AD2 expression was significantly more frequent in carcinomas in younger women (age ≤40 years; P < 0.001) and AD1 expression was also associated with oestrogen receptor-negative and grade 3 tumours (P < 0.05). AD1 was found to be incorporated into a tumour-specific isoform, not detected in normal tissues. 
Overexpression of the TNC-14/AD1/16 isoform significantly enhanced tumour cell invasion (P < 0.01) and growth (P < 0.01) over base levels. Conclusions: Together these data suggest a highly significant association between AD-containing TNC isoforms and breast cancers in younger women (age ≤40 years), which may have important functional significance in vivo

Springer - Publisher Connector

PubMed Central

Leicester Research Archive

Perceptions of European ME/CFS experts concerning knowledge and understanding of ME/CFS among primary care physicians in Europe : A report from the European ME/CFS research network (EUROMENE)

Author: Araja Diana
Berkis Uldis
Brenna Elenka
Cullinan John
De Korwin Jean Dominique
Gitto Lara
Hughes Dyfrig A.
Hunter Rachael M.
Pheby Derek F.H.
Trepel Dominic
Wang-Steverding Xia
Publication venue: 'MDPI AG'
Publication date: 26/02/2021

Publisher Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. Background and Objectives: We have conducted a survey of academic and clinical experts who are participants in the European ME/CFS Research Network (EUROMENE) to elicit perceptions of general practitioner (GP) knowledge and understanding of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and suggestions as to how this could be improved. Materials and Methods: A questionnaire was sent to all national representatives and members of the EUROMENE Core Group and Management Committee. Survey responses were collated and then summarized based on the numbers and percentages of respondents selecting each response option, while weighted average responses were calculated for questions with numerical value response options. Free text responses were analysed using thematic analysis. Results: Overall there were 23 responses to the survey from participants across 19 different European countries, with a 95% country-level response rate. Serious concerns were expressed about GPs’ knowledge and understanding of ME/CFS, and, it was felt, about 60% of patients with ME/CFS went undiagnosed as a result. The vast majority of GPs were perceived to lack confidence in either diagnosing or managing the condition. Disbelief, and misleading illness attributions, were perceived to be widespread, and the unavailability of specialist centres to which GPs could refer patients and seek advice and support was frequently commented upon. There was widespread support for more training on ME/CFS at both undergraduate and postgraduate levels. Conclusion: The results of this survey are consistent with the existing scientific literature. ME/CFS experts report that lack of knowledge and understanding of ME/CFS among GPs is a major cause of missed and delayed diagnoses, which renders problematic attempts to determine the incidence and prevalence of the disease, and to measure its economic impact. It also contributes to the burden of disease through mismanagement in its early stages.

University of Liverpool Repository

Bucks New University: Bucks Knowledge Archive

Bangor University Research Portal

Riga Stradins university

Using linear and natural cubic splines, SITAR, and latent trajectory models to characterise nonlinear longitudinal growth trajectories in cohort studies

Author: Baxter-Jones Adam DG
Cole Tim J
Cousminer Diana L
Elhakeem Ahmed
Grant Struan FA
Hughes Rachael A
Jackowski Stefan A
Kwong Alex SF
Lawlor Deborah A
Li Zheyuan
Tilling Kate
Zemel Babette S
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

BACKGROUND: Longitudinal data analysis can improve our understanding of the influences on health trajectories across the life-course. There are a variety of statistical models which can be used, and their fitting and interpretation can be complex, particularly where there is a nonlinear trajectory. Our aim was to provide an accessible guide along with applied examples to using four sophisticated modelling procedures for describing nonlinear growth trajectories. METHODS: This expository paper provides an illustrative guide to summarising nonlinear growth trajectories for repeatedly measured continuous outcomes using (i) linear spline and (ii) natural cubic spline linear mixed-effects (LME) models, (iii) Super Imposition by Translation and Rotation (SITAR) nonlinear mixed effects models, and (iv) latent trajectory models. The underlying model for each approach, their similarities and differences, and their advantages and disadvantages are described. Their application and correct interpretation of their results is illustrated by analysing repeated bone mass measures to characterise bone growth patterns and their sex differences in three cohort studies from the UK, USA, and Canada comprising 8500 individuals and 37,000 measurements from ages 5-40 years. Recommendations for choosing a modelling approach are provided along with a discussion and signposting on further modelling extensions for analysing trajectory exposures and outcomes, and multiple cohorts. RESULTS: Linear and natural cubic spline LME models and SITAR provided similar summary of the mean bone growth trajectory and growth velocity, and the sex differences in growth patterns. Growth velocity (in grams/year) peaked during adolescence, and peaked earlier in females than males e.g., mean age at peak bone mineral content accrual from multicohort SITAR models was 12.2 years in females and 13.9 years in males. 
Latent trajectory models (with trajectory shapes estimated using a natural cubic spline) identified up to four subgroups of individuals with distinct trajectories throughout adolescence. CONCLUSIONS: LME models with linear and natural cubic splines, SITAR, and latent trajectory models are useful for describing nonlinear growth trajectories, and these methods can be adapted for other complex traits. Choice of method depends on the research aims, complexity of the trajectory, and available data. Scripts and synthetic datasets are provided for readers to replicate trajectory modelling and visualisation using the R statistical computing software

UCL Discovery